This script analyzes data from two active/passive learning experiments.

Libraries
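The code chunk here presumably loads the packages used throughout. Based on the dplyr-style `filter()` call and the `glmerMod` output later in this document, a minimal setup would be (the tidyverse load is an assumption):

```r
library(tidyverse)  # dplyr/ggplot2 for wrangling and plots (assumed)
library(lme4)       # glmer() for the mixed-effects logistic models below
```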

Load data

Rename conditions and reorder levels of the condition factor.
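A sketch of this step, assuming the data frame is called `df` and using the five condition codes that appear in the descriptives table (any recoding from raw labels would happen just before the releveling):

```r
library(dplyr)

# Reorder the condition factor so plots and model contrasts use this order
df <- df %>%
  mutate(condition = factor(condition,
                            levels = c("AA", "RR", "RA", "AR", "YY")))
```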

Exclusionary criteria

Remove participants who reported misunderstanding the task
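Assuming a column recording the exit-survey response (the name `understood_task` is a placeholder, not the actual variable), the exclusion might look like:

```r
library(dplyr)

# Keep only participants who reported understanding the task
# (`understood_task` is an assumed column name)
df <- df %>% filter(understood_task == "yes")
```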

Descriptives

condition order category_type mean_exp_length sd_exp_length count participants_needed
AA order1 information-integration 9.931370 2.4703157 11 9
AA order1 rule-based 11.625209 8.5238214 67 -47
AA order2 information-integration 11.746222 5.9075187 12 8
AA order2 rule-based 18.602295 7.1491220 14 6
RR order1 information-integration 3.457430 1.4751474 9 11
RR order1 rule-based 11.843223 17.3986276 29 -9
RR order2 information-integration 3.078767 0.6023572 14 6
RR order2 rule-based 11.421539 7.4618685 16 4
RA order1 information-integration 7.722699 3.6089647 24 -4
RA order1 rule-based 6.037413 2.0833147 17 3
RA order2 information-integration 5.715281 1.7631412 18 2
RA order2 rule-based 5.509832 2.0370340 18 2
AR order1 information-integration 6.092907 1.2644156 12 8
AR order1 rule-based 7.680843 5.1584244 14 6
AR order2 information-integration 6.508570 1.5896296 19 1
AR order2 rule-based 6.434325 2.8250577 25 -5
YY order1 information-integration 4.069612 1.0631784 10 10
YY order1 rule-based 4.799982 1.7901079 20 0
YY order2 information-integration 5.415993 2.3809064 8 12
YY order2 rule-based 4.090271 0.7830975 4 16
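A summary along these lines reproduces the table above. The `exp_length` column name is an assumption, `subids` comes from the model output below, and the per-cell target of 20 participants is inferred from the `participants_needed` column (count + participants_needed = 20 in every row):

```r
library(dplyr)

descriptives <- df %>%
  distinct(subids, condition, order, category_type, exp_length) %>%
  group_by(condition, order, category_type) %>%
  summarise(
    mean_exp_length     = mean(exp_length),
    sd_exp_length       = sd(exp_length),
    count               = n(),
    participants_needed = 20 - count,  # 20 per cell, inferred from the table
    .groups = "drop"
  )
```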

Histogram of length of experiment split by condition
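A ggplot sketch of this histogram (`exp_length` is an assumed column name; the units of experiment length are not recorded in this output):

```r
library(dplyr)
library(ggplot2)

ggplot(distinct(df, subids, condition, exp_length),
       aes(x = exp_length)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ condition) +
  labs(x = "Experiment length", y = "Participants")
```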

Overall accuracy analysis

Get mean accuracy for each condition and category type
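Using the `trial_type`, `correct`, `condition`, and `category_type` columns that appear in the model output below, a sketch of this summary:

```r
library(dplyr)

acc_means <- df %>%
  filter(trial_type == "test") %>%
  group_by(condition, category_type) %>%
  summarise(mean_acc = mean(correct), .groups = "drop")
```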

Plot.

Accuracy by block analysis

Next, we analyze accuracy across the two blocks.
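The same summary, now also split by block (`block_factor` is the block variable named in the model output below):

```r
library(dplyr)

acc_by_block <- df %>%
  filter(trial_type == "test") %>%
  group_by(condition, block_factor, category_type) %>%
  summarise(mean_acc = mean(correct), .groups = "drop")
```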

Plot.

The block analysis suggests some effect of order on active learning. Receptive-first learners appear to be more accurate on their block of active learning (block 2) than Active-first learners are on theirs (block 1).

But the effect is not large enough to overcome the overall active-learning advantage that Active-first learners show in block 1.

Accuracy by block and order

Order here refers to whether size or angle was the category dimension.

Rename the order labels so they are interpretable.

Plot accuracy over blocks

Rule-Based category structure

For the category that depends on size, AA and RA end up on top of each other, whereas AR and RR do not. I'm not sure what's going on with the "angle" category. Perhaps it is simply easier to learn overall, so we are not seeing any condition differences?

Also, there seems to be some between-subjects variation here. Could this explain why the RR learners are the best in the angle category? Should we try to replicate this order difference?

Information Integration category structure

Evidence selection analysis (active learning)

Analyze the average distance of participants’ samples from the optimal decision boundary.

Rotate so that orientation and radius are on the same dimension.
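One way to implement the rotation, assuming each sample row has standardized orientation and radius coordinates and that the optimal information-integration boundary lies on the 45-degree diagonal (both assumptions):

```r
theta <- -pi / 4  # rotate the 45-degree diagonal boundary onto one axis
R <- matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)),
            nrow = 2, byrow = TRUE)

xy  <- as.matrix(samples[, c("orientation", "radius")])
rot <- xy %*% t(R)
samples$along_boundary <- rot[, 1]  # position along the boundary
samples$off_boundary   <- rot[, 2]  # distance-relevant dimension
```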

Plot group level sampling behavior.

Plot individual participant sampling behavior

Get distance from optimal decision boundary for each sample.
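The perpendicular distance of a point (x0, y0) from a line a*x + b*y + c = 0 is |a*x0 + b*y0 + c| / sqrt(a^2 + b^2). A sketch, with placeholder coefficients standing in for the true optimal boundary:

```r
# Distance of each sample from the boundary a*x + b*y + c = 0
# (a, b, c below are placeholders for the true optimal boundary)
boundary_dist <- function(x, y, a, b, c) {
  abs(a * x + b * y + c) / sqrt(a^2 + b^2)
}

samples$dist <- boundary_dist(samples$orientation, samples$radius,
                              a = 1, b = -1, c = 0)  # 45-degree diagonal
```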

Now get the average distance across subjects

Plot.

Active learning is better after a block of receptive learning trials, but not better than two blocks of active learning trials.

Relationship between sampling and test

Get the mean sample distance and accuracy for each participant.
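Combining the per-participant sampling and test summaries (here `dist` is the sample-to-boundary distance computed above; the exact column names are assumptions):

```r
library(dplyr)

subj_summary <- samples %>%
  group_by(subids) %>%
  summarise(mean_dist = mean(dist), .groups = "drop") %>%
  left_join(
    df %>%
      filter(trial_type == "test") %>%
      group_by(subids) %>%
      summarise(mean_acc = mean(correct), .groups = "drop"),
    by = "subids"
  )
```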

Plot

Individual accuracy across blocks: consistency analysis

Plot.

The overall pattern of accuracy across blocks differs by condition: Active-first learners show a shallower slope (less gain from block 1 to block 2) than Receptive-first learners.

A couple of participants are doing something odd: a large drop in accuracy in block 2. But this shows up in both conditions.

Maybe there are some other analyses to do at the individual participant level?

Statistics

Accuracy on the active learning block is trending toward a reliable difference (it is marginally significant if you look only at Order 1, the size condition).

Models

Accuracy on the trial-level based on condition and block

Does condition and block predict accuracy on test trials?
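The model summarized below can be reproduced directly from the output header — the formula, data filter, family, optimizer, and nAGQ setting are all taken from it:

```r
library(lme4)
library(dplyr)

m_acc <- glmer(
  correct ~ condition * block_factor * category_type + (1 | subids),
  data    = filter(df, trial_type == "test", condition != "YY"),
  family  = binomial,
  control = glmerControl(optimizer = "bobyqa"),
  nAGQ    = 0
)
summary(m_acc)
```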

## Generalized linear mixed model fit by maximum likelihood (Adaptive
##   Gauss-Hermite Quadrature, nAGQ = 0) [glmerMod]
##  Family: binomial  ( logit )
## Formula: correct ~ condition * block_factor * category_type + (1 | subids)
##    Data: filter(df, trial_type == "test", condition != "YY")
## Control: glmerControl(optimizer = "bobyqa")
## 
##      AIC      BIC   logLik deviance df.resid 
##  25028.4  25165.9 -12497.2  24994.4    23990 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -5.5350 -0.8352  0.3613  0.6274  1.9934 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  subids (Intercept) 0.7501   0.8661  
## Number of obs: 24007, groups:  subids, 365
## 
## Fixed effects:
##                                                   Estimate Std. Error
## (Intercept)                                        0.76072    0.18258
## conditionRR                                       -0.49394    0.24940
## conditionRA                                       -0.17368    0.22545
## conditionAR                                       -0.23854    0.23490
## block_factor2                                      0.13495    0.10611
## category_typerule-based                            0.71407    0.20801
## conditionRR:block_factor2                          0.10591    0.14603
## conditionRA:block_factor2                          0.05830    0.13021
## conditionAR:block_factor2                         -0.00980    0.13571
## conditionRR:category_typerule-based                0.02952    0.28870
## conditionRA:category_typerule-based                0.11177    0.28316
## conditionAR:category_typerule-based                0.10707    0.29173
## block_factor2:category_typerule-based              0.40867    0.13069
## conditionRR:block_factor2:category_typerule-based -0.01663    0.18626
## conditionRA:block_factor2:category_typerule-based  0.02139    0.18557
## conditionAR:block_factor2:category_typerule-based -0.29260    0.18178
##                                                   z value Pr(>|z|)    
## (Intercept)                                         4.167 3.09e-05 ***
## conditionRR                                        -1.980 0.047649 *  
## conditionRA                                        -0.770 0.441083    
## conditionAR                                        -1.015 0.309869    
## block_factor2                                       1.272 0.203424    
## category_typerule-based                             3.433 0.000597 ***
## conditionRR:block_factor2                           0.725 0.468306    
## conditionRA:block_factor2                           0.448 0.654348    
## conditionAR:block_factor2                          -0.072 0.942434    
## conditionRR:category_typerule-based                 0.102 0.918554    
## conditionRA:category_typerule-based                 0.395 0.693042    
## conditionAR:category_typerule-based                 0.367 0.713614    
## block_factor2:category_typerule-based               3.127 0.001766 ** 
## conditionRR:block_factor2:category_typerule-based  -0.089 0.928838    
## conditionRA:block_factor2:category_typerule-based   0.115 0.908248    
## conditionAR:block_factor2:category_typerule-based  -1.610 0.107474    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) cndtRR cndtRA cndtAR blck_2 ctgr_- cnRR:_2 cnRA:_2
## conditionRR -0.732                                                   
## conditionRA -0.810  0.593                                            
## conditionAR -0.777  0.569  0.629                                     
## block_fctr2 -0.283  0.207  0.229  0.220                              
## ctgry_typr- -0.878  0.643  0.711  0.682  0.249                       
## cndtnRR:b_2  0.206 -0.285 -0.167 -0.160 -0.727 -0.181                
## cndtnRA:b_2  0.231 -0.169 -0.281 -0.179 -0.815 -0.203  0.592         
## cndtnAR:b_2  0.221 -0.162 -0.179 -0.283 -0.782 -0.194  0.568   0.637 
## cndtnRR:c_-  0.632 -0.864 -0.512 -0.492 -0.179 -0.691  0.246   0.146 
## cndtnRA:c_-  0.645 -0.472 -0.796 -0.501 -0.183 -0.716  0.133   0.224 
## cndtnAR:c_-  0.626 -0.429 -0.507 -0.805 -0.177 -0.713  0.129   0.144 
## blck_fc2:_-  0.230 -0.168 -0.186 -0.179 -0.812 -0.290  0.590   0.662 
## cndRR:_2:_- -0.161  0.223  0.131  0.125  0.570  0.203 -0.784  -0.464 
## cndRA:_2:_- -0.162  0.119  0.197  0.126  0.572  0.204 -0.415  -0.702 
## cndAR:_2:_- -0.165  0.121  0.134  0.211  0.584  0.209 -0.424  -0.476 
##             cnAR:_2 cRR:_- cRA:_- cAR:_- b_2:_- cRR:_2: cRA:_2:
## conditionRR                                                    
## conditionRA                                                    
## conditionAR                                                    
## block_fctr2                                                    
## ctgry_typr-                                                    
## cndtnRR:b_2                                                    
## cndtnRA:b_2                                                    
## cndtnAR:b_2                                                    
## cndtnRR:c_-  0.140                                             
## cndtnRA:c_-  0.143   0.510                                     
## cndtnAR:c_-  0.227   0.468  0.510                              
## blck_fc2:_-  0.635   0.209  0.213  0.207                       
## cndRR:_2:_- -0.445  -0.296 -0.150 -0.145 -0.702                
## cndRA:_2:_- -0.447  -0.147 -0.295 -0.146 -0.704  0.494         
## cndAR:_2:_- -0.747  -0.150 -0.153 -0.293 -0.719  0.504   0.506

Reliable interaction between condition and block. Receptive-first learners perform better on the second block of test trials than Active-first learners.

But overall, the two groups do not differ from one another. How should we interpret this?

Accuracy based on sampling behavior and condition

Does mean accuracy depend on sampling behavior and condition?
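A simple participant-level sketch of this model, using the `subj_summary` frame built in the sampling analysis (a linear model is shown here for illustration; the actual analysis may differ):

```r
m_samp_acc <- lm(mean_acc ~ mean_dist * condition, data = subj_summary)
summary(m_samp_acc)
```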

Reliable interaction between mean sample distance and condition: for Receptive-first learners, better sampling predicts better test accuracy, but not for Active-first learners.

Sampling behavior based on condition

Which condition is “better” at sampling?
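One simple way to ask this, again assuming the per-participant `subj_summary` frame from the sampling analysis (smaller `mean_dist` meaning samples closer to the optimal boundary):

```r
m_dist <- lm(mean_dist ~ condition, data = subj_summary)
summary(m_dist)
```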

Receptive-first participants are better at sampling than Active-first participants.

Effect coding (condition vs. category type) to test main effects.

Effect-code the factors (choose contrasts based on how you want to interpret the model output).
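Sum-to-zero (deviation) contrasts are one standard choice here; with them, each main effect is evaluated at the unweighted grand mean rather than against a reference level:

```r
# Deviation coding: coefficients compare each level to the grand mean
contrasts(df$condition)     <- contr.sum(nlevels(df$condition))
contrasts(df$category_type) <- contr.sum(nlevels(df$category_type))
```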

Model with effect coding.

With effect coding, the intercept is the unweighted mean of the group means; because these data are unbalanced, this is not the same as the grand mean of all observations. Active is better than passive, and information-integration is worse than rule-based.